K8s-9372: Add modification for custom policy #57
Conversation
// CustomMachineSetDeletePolicy prioritizes both Machines that have the annotation
// "cluster.k8s.io/delete-machine=yes" and Machines that are unhealthy
// (Status.ErrorReason or Status.ErrorMessage are set to a non-empty value).
// It then prioritizes Machines whose Status is different from Ready.
Maybe say explicitly that it prioritizes new machines over old ones for deletion
@@ -67,6 +76,34 @@ func oldestDeletePriority(machine *v1alpha1.Machine) deletePriority {
	return deletePriority(float64(mustDelete) * (1.0 - math.Exp(-d.Seconds()/secondsPerTenDays)))
}

func customDeletePolicy(machine *v1alpha1.Machine) deletePriority {
	// Iterate through conditions to check whether the machine is not "Ready"
Because we always return mustDelete, there's no real order to the machines with that value.
That's something we should eventually improve; for now maybe add a comment.
However, maybe reorder these checks already (a rough sketch of this ordering follows the list):
- deletion timestamp non zero
- machine has no nodeRef -> still provisioning
- deletion annotation
- node status NotReady
- error status
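A rough sketch of that ordering, relying on identifiers already in this file (mustDelete, DeleteNodeAnnotation, oldestDeletePriority, v1alpha1.Machine); machineNotReady and machineHasError are hypothetical helpers standing in for the condition/error checks from the diff, and the Status.NodeRef nil check assumes a pointer field as in upstream cluster-api, so this is an illustration rather than the code in this PR:

func customDeletePolicy(machine *v1alpha1.Machine) deletePriority {
	// 1. Machine is already being deleted.
	if !machine.ObjectMeta.DeletionTimestamp.IsZero() {
		return mustDelete
	}
	// 2. No nodeRef yet -> the machine is still provisioning.
	if machine.Status.NodeRef == nil {
		return mustDelete
	}
	// 3. The operator explicitly marked the machine for deletion.
	if v, ok := machine.ObjectMeta.Annotations[DeleteNodeAnnotation]; ok && v != "" {
		return mustDelete
	}
	// 4. Node status is NotReady (hypothetical helper wrapping the condition loop).
	if machineNotReady(machine) {
		return mustDelete
	}
	// 5. Machine reports an error (hypothetical helper over ErrorReason/ErrorMessage).
	if machineHasError(machine) {
		return mustDelete
	}
	// Otherwise fall back to the age-based priority discussed further down.
	return mustDelete - oldestDeletePriority(machine)
}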
Okay, I'll "split" the NodeReady/Conditions report to make machine selection more granular.
		return mustDelete
	}
	// If annotations are set + DeleteAnnotation is present
if machine.ObjectMeta.Annotations != nil && machine.ObjectMeta.Annotations[DeleteNodeAnnotation] != "" { |
Unnecessary nil check:
-	if machine.ObjectMeta.Annotations != nil && machine.ObjectMeta.Annotations[DeleteNodeAnnotation] != "" {
+	if v, ok := machine.ObjectMeta.Annotations[DeleteNodeAnnotation]; ok && v != "" {
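The nil check is redundant because indexing a nil map in Go is safe and simply yields the zero value; a minimal standalone example, not tied to this repo:

package main

import "fmt"

func main() {
	// A nil map, like the Annotations field of an object with no annotations.
	var annotations map[string]string

	// Reading from a nil map does not panic; it returns the zero value,
	// so the comma-ok lookup alone is enough.
	v, ok := annotations["cluster.k8s.io/delete-machine"]
	fmt.Printf("value=%q present=%v\n", v, ok) // value="" present=false
}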
	// Thinking about this, maybe it's a good idea to delegate to oldestDeletePriority,
	// as newestDeletePriority does when no match is found, or just return couldDelete
	return mustDelete - oldestDeletePriority(machine)
Makes sense like this.
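For intuition, a standalone sketch of what this fallback does numerically; mustDelete = 100 and secondsPerTenDays = 864000 are assumed values, and ageBasedPriority only mirrors the shape of the oldestDeletePriority formula from the hunk above:

package main

import (
	"fmt"
	"math"
)

const (
	mustDelete        = 100.0    // assumed value for illustration
	secondsPerTenDays = 864000.0 // ten days in seconds
)

// ageBasedPriority mirrors the exponential formula used by oldestDeletePriority.
func ageBasedPriority(ageSeconds float64) float64 {
	return mustDelete * (1.0 - math.Exp(-ageSeconds/secondsPerTenDays))
}

func main() {
	// mustDelete - ageBasedPriority(age) gives newer machines a higher
	// delete priority, which is the effect of the fallback above.
	for _, age := range []float64{60, 86400, 30 * 86400} { // 1 minute, 1 day, 30 days
		fmt.Printf("age=%9.0fs  oldest=%6.2f  fallback=%6.2f\n",
			age, ageBasedPriority(age), mustDelete-ageBasedPriority(age))
	}
}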
@squ94wk I think that with these changes we'll have the expected delete policy. The only pending change (which requires some extra changes on the MC side) is to get the updated status from the Machine rather than the Node. Currently, the machine can be powered off or the kubelet stopped, and the status still shows as "Ready":
Machine: test-mcontroller-65f466779d-x2mgb
- Condition: {Ready True 0001-01-01 00:00:00 +0000 UTC 0001-01-01 00:00:00 +0000 UTC }
- Status: True
- Type: Ready
Info: Machine qj6hz was off, and Status/Type still reported Ready: True when a downscale from 2 machines to 1 was triggered from the UI. The conditions were put in the expected order, and the PR comments were applied to improve the code. I'll check with some extra tests, but for now it looks fine.
What this PR does / why we need it:
This PR is required by K8s-9372 to add a new DeletePolicy that allows us to remove machines whose status is different from "Ready" before considering the other criteria, such as machine age (newest or oldest).
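As a rough, hypothetical illustration of where such a policy usually plugs in (the actual selection code and the "Custom" policy name in this repo may differ; only customDeletePolicy, oldestDeletePriority and newestDeletePriority come from the diff above):

// Hypothetical sketch: map a MachineSet's delete policy name to a priority function.
// The function name and policy strings are assumptions, not code from this PR.
func getDeletePriorityFuncSketch(policy string) func(*v1alpha1.Machine) deletePriority {
	switch policy {
	case "Custom":
		return customDeletePolicy
	case "Oldest":
		return oldestDeletePriority
	case "Newest":
		return newestDeletePriority
	default:
		// Whatever default the repo already uses would go here.
		return newestDeletePriority
	}
}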
Which issue(s) this PR fixes:
Fixes #K8s-9372